SEQUENCE LISTING

(1) GENERAL INFORMATION

(i) APPLICANT: BRUCK, CLAUDINE

(ii) TITLE OF THE INVENTION: VACCINE

(iii) NUMBER OF SEQUENCES: 23

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: SmithKline Beecham
    (B) STREET: 2 New Horizons Court, Great West Road, B
    (C) CITY: Middx
    (D) STATE:
    (E) COUNTRY: UK
    (F) ZIP: TW8 9EP

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette
    (B) COMPUTER: IBM Compatible
    (C) OPERATING SYSTEM: DOS
    (D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:

(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Dalton, Marcus J
    (B) REGISTRATION NUMBER:
    (C) REFERENCE/DOCKET NUMBER: B45124

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 0181 9756348
    (B) TELEFAX: 0181 9756|77
    (C) TELEX:



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 220 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 His

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met His Gly Asp Thr Pro Thr Leu His Glu Tyr Met Leu Asp Leu
        115                 120                 125
Gln Pro Glu Thr Thr Asp Leu Tyr Cys Tyr Glu Gln Leu Asn Asp Ser
    130                 135                 140
Ser Glu Glu Glu Asp Glu Ile Asp Gly Pro Ala Gly Gln Ala Glu Pro
145                 150                 155                 160
Asp Arg Ala His Tyr Asn Ile Val Thr Phe Cys Cys Lys Cys Asp Ser
                165                 170                 175
Thr Leu Arg Leu Cys Val Gln Ser Thr His Val Asp Ile Arg Thr Leu
            180                 185                 190
Glu Asp Leu Leu Met Gly Thr Leu Gly Ile Val Cys Pro Ile Cys Ser
        195                 200                 205
Gln Lys Pro Thr Ser Gly His His His His His His
    210                 215                 220

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 663 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 His

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGCATGGAGA TACACCTACA  360
TTGCATGAAT ATATGTTAGA TTTGCAACCA GAGACAACTG ATCTCTACTG TTATGAGCAA  420
TTAAATGACA GCTCAGAGGA GGAGGATGAA ATAGATGGTC CAGCTGGACA AGCAGAACCG  480
GACAGAGCCC ATTACAATAT TGTAACCTTT TGTTGCAAGT GTGACTCTAC GCTTCGGTTG  540
TGCGTACAAA GCACACACGT AGACATTCGT ACTTTGGAAG ACCTGTTAAT GGGCACACTA  600
GGAATTGTGT GCCCCATCTG TTCTCAGAAA CCAACTAGTG GCCACCATCA CCATCACCAT  660
TAA  663



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 822 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E6 His/HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGTTTCAGGA CCCACAGGAG  360
CGACCCAGAA AGTTACCACA GTTATGCACA GAGCTGCAAA CAACTATACA TGATATAATA  420
TTAGAATGTG TGTACTGCAA GCAACAGTTA CTGCGACGTG AGGTATATGA CTTTGCTTTT  480
CGGGATTTAT GCATAGTATA TAGAGATGGG AATCCATATG CTGTATGTGA TAAATGTTTA  540
AAGTTTTATT CTAAAATTAG TGAGTATAGA CATTATTGTT ATAGTTTGTA TGGAACAACA  600
TTAGAACAGC AATACAACAA ACCGTTGTGT GATTTGTTAA TTAGGTGTAT TAACTGTCAA  660
AAGCCACTGT GTCCTGAAGA AAAGCAAAGA CATCTGGACA AAAAGCAAAG ATTCCATAAT  720
ATAAGGGGTC GGTGGACCGG TCGATGTATG TCTTGTTGCA GATCATCAAG AACACGTAGA  780
GAAACCCAGC TGACTAGTGG CCACCATCAC CATCACCATT AA  822



(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 274 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E6 His/HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met Phe Gln Asp Pro Gln Glu Arg Pro Arg Lys Leu Pro Gln Leu
        115                 120                 125
Cys Thr Glu Leu Gln Thr Thr Ile His Asp Ile Ile Leu Glu Cys Val
    130                 135                 140
Tyr Cys Lys Gln Gln Leu Leu Arg Arg Glu Val Tyr Asp Phe Ala Phe
145                 150                 155                 160
Arg Asp Leu Cys Ile Val Tyr Arg Asp Gly Asn Pro Tyr Ala Val Cys
                165                 170                 175
Asp Lys Cys Leu Lys Phe Tyr Ser Lys Ile Ser Glu Tyr Arg His Tyr
            180                 185                 190
Cys Tyr Ser Leu Tyr Gly Thr Thr Leu Glu Gln Gln Tyr Asn Lys Pro
        195                 200                 205
Leu Cys Asp Leu Leu Ile Arg Cys Ile Asn Cys Gln Lys Pro Leu Cys
    210                 215                 220
Pro Glu Glu Lys Gln Arg His Leu Asp Lys Lys Gln Arg Phe His Asn
225                 230                 235                 240
Ile Arg Gly Arg Trp Thr Gly Arg Cys Met Ser Cys Cys Arg Ser Ser
                245                 250                 255
Arg Thr Arg Arg Glu Thr Gln Leu Thr Ser Gly His His His His His
            260                 265                 270
His
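
The nucleotide and amino acid records in this listing come in pairs, so a paired record can be spot-checked by translating the DNA in frame. The following is a minimal sketch, not part of the filed listing: the helper name `translate` and the compact codon-table construction are illustrative assumptions, and the standard genetic code is assumed. It translates the last 42 bases of SEQ ID NO:3 (positions 781-822) so the result can be compared with the C-terminal residues of SEQ ID NO:4.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string to one-letter residues ('*' marks a stop)."""
    seq = "".join(dna.split()).upper()  # drop the 10-base block spacing
    usable = len(seq) - len(seq) % 3    # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Positions 781-822 of SEQ ID NO:3, copied from the listing above.
tail = "GAAACCCAGC TGACTAGTGG CCACCATCAC CATCACCATT AA"
print(translate(tail))
```

The printed string reads Glu Thr Gln Leu Thr Ser Gly, six His residues, then a stop, which is how SEQ ID NO:4 ends. The same check applied to a whole record would count the stop codon as one of the translated codons.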



(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1116 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E6/E7/ HPV16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGTTTCAGGA CCCACAGGAG  360
CGACCCAGAA AGTTACCACA GTTATGCACA GAGCTGCAAA CAACTATACA TGATATAATA  420
TTAGAATGTG TGTACTGCAA GCAACAGTTA CTGCGACGTG AGGTATATGA CTTTGCTTTT  480
CGGGATTTAT GCATAGTATA TAGAGATGGG AATCCATATG CTGTATGTGA TAAATGTTTA  540
AAGTTTTATT CTAAAATTAG TGAGTATAGA CATTATTGTT ATAGTTTGTA TGGAACAACA  600
TTAGAACAGC AATACAACAA ACCGTTGTGT GATTTGTTAA TTAGGTGTAT TAACTGTCAA  660
AAGCCACTGT GTCCTGAAGA AAAGCAAAGA CATCTGGACA AAAAGCAAAG ATTCCATAAT  720
ATAAGGGGTC GGTGGACCGG TCGATGTATG TCTTGTTGCA GATCATCAAG AACACGTAGA  780
GAAACCCAGC TGATGCATGG AGATACACCT ACATTGCATG AATATATGTT AGATTTGCAA  840
CCAGAGACAA CTGATCTCTA CTGTTATGAG CAATTAAATG ACAGCTCAGA GGAGGAGGAT  900
GAAATAGATG GTCCAGCTGG ACAAGCAGAA CCGGACAGAG CCCATTACAA TATTGTAACC  960
TTTTGTTGCA AGTGTGACTC TACGCTTCGG TTGTGCGTAC AAAGCACACA CGTAGACATT  1020
CGTACTTTGG AAGACCTGTT AATGGGCACA CTAGGAATTG TGTGCCCCAT CTGTTCTCAG  1080
AAACCAACTA GTGGCCACCA TCACCATCAC CATTAA  1116

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 372 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E6/E7/ HPV16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met Phe Gln Asp Pro Gln Glu Arg Pro Arg Lys Leu Pro Gln Leu
        115                 120                 125
Cys Thr Glu Leu Gln Thr Thr Ile His Asp Ile Ile Leu Glu Cys Val
    130                 135                 140
Tyr Cys Lys Gln Gln Leu Leu Arg Arg Glu Val Tyr Asp Phe Ala Phe
145                 150                 155                 160
Arg Asp Leu Cys Ile Val Tyr Arg Asp Gly Asn Pro Tyr Ala Val Cys
                165                 170                 175
Asp Lys Cys Leu Lys Phe Tyr Ser Lys Ile Ser Glu Tyr Arg His Tyr
            180                 185                 190
Cys Tyr Ser Leu Tyr Gly Thr Thr Leu Glu Gln Gln Tyr Asn Lys Pro
        195                 200                 205
Leu Cys Asp Leu Leu Ile Arg Cys Ile Asn Cys Gln Lys Pro Leu Cys
    210                 215                 220
Pro Glu Glu Lys Gln Arg His Leu Asp Lys Lys Gln Arg Phe His Asn
225                 230                 235                 240
Ile Arg Gly Arg Trp Thr Gly Arg Cys Met Ser Cys Cys Arg Ser Ser
                245                 250                 255
Arg Thr Arg Arg Glu Thr Gln Leu Met His Gly Asp Thr Pro Thr Leu
            260                 265                 270
His Glu Tyr Met Leu Asp Leu Gln Pro Glu Thr Thr Asp Leu Tyr Cys
        275                 280                 285
Tyr Glu Gln Leu Asn Asp Ser Ser Glu Glu Glu Asp Glu Ile Asp Gly
    290                 295                 300
Pro Ala Gly Gln Ala Glu Pro Asp Arg Ala His Tyr Asn Ile Val Thr
305                 310                 315                 320
Phe Cys Cys Lys Cys Asp Ser Thr Leu Arg Leu Cys Val Gln Ser Thr
                325                 330                 335
His Val Asp Ile Arg Thr Leu Glu Asp Leu Leu Met Gly Thr Leu Gly
            340                 345                 350
Ile Val Cys Pro Ile Cys Ser Gln Lys Pro Thr Ser Gly His His His
        355                 360                 365
His His His
    370

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 663 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 mutated HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGCATGGAGA TACACCTACA  360
TTGCATGAAT ATATGTTAGA TTTGCAACCA GAGACAACTG ATCTCTACGG TTATCAGCAA  420
TTAAATGACA GCTCAGAGGA GGAGGATGAA ATAGATGGTC CAGCTGGACA AGCAGAACCG  480
GACAGAGCCC ATTACAATAT TGTAACCTTT TGTTGCAAGT GTGACTCTAC GCTTCGGTTG  540
TGCGTACAAA GCACACACGT AGACATTCGT ACTTTGGAAG ACCTGTTAAT GGGCACACTA  600
GGAATTGTGT GCCCCATCTG TTCTCAGAAA CCAACTAGTG GCCACCATCA CCATCACCAT  660
TAA  663

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 220 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 mutated HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met His Gly Asp Thr Pro Thr Leu His Glu Tyr Met Leu Asp Leu
        115                 120                 125
Gln Pro Glu Thr Thr Asp Leu Tyr Gly Tyr Gln Gln Leu Asn Asp Ser
    130                 135                 140
Ser Glu Glu Glu Asp Glu Ile Asp Gly Pro Ala Gly Gln Ala Glu Pro
145                 150                 155                 160
Asp Arg Ala His Tyr Asn Ile Val Thr Phe Cys Cys Lys Cys Asp Ser
                165                 170                 175
Thr Leu Arg Leu Cys Val Gln Ser Thr His Val Asp Ile Arg Thr Leu
            180                 185                 190
Glu Asp Leu Leu Met Gly Thr Leu Gly Ile Val Cys Pro Ile Cys Ser
        195                 200                 205
Gln Lys Pro Thr Ser Gly His His His His His His
    210                 215                 220

PCT/EP98/08563

(2) INFORMATION FOR SEQ ID NO:9:



(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 879 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E6 His HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

ATGAAAGGGG GAATTGTACA TTCAGACGGC TCTTATCCAA AAGACAAGTT TGAGAAAATC  60
AATGGCACTT GGTACTACTT TGACAGTTCA GGCTATATGC TTGCAGACCG CTGGAGGAAG  120
CACACAGACG GCAACTGGTA CTGGTTCGAC AACTCAGGCG AAATGGCTAC AGGCTGGAAG  180
AAAATCGCTG ATAAGTGGTA CTATTTCAAC GAAGAAGGTG CCATGAAGAC AGGCTGGGTC  240
AAGTACAAGG ACACTTGGTA CTACTTAGAC GCTAAAGAAG GCGCCATGGT ATCAAATGCC  300
TTTATCCAGT CAGCGGACGG AACAGGCTGG TACTACCTCA AACCAGACGG AACACTGGCA  360
GACAGGCCAG AATTGGCCAG CATGCTGGAC ATGGCCATGT TTCAGGACCC ACAGGAGCGA  420
CCCAGAAAGT TACCACAGTT ATGCACAGAG CTGCAAACAA CTATACATGA TATAATATTA  480
GAATGTGTGT ACTGCAAGCA ACAGTTACTG CGACGTGAGG TATATGACTT TGCTTTTCGG  540
GATTTATGCA TAGTATATAG AGATGGGAAT CCATATGCTG TATGTGATAA ATGTTTAAAG  600
TTTTATTCTA AAATTAGTGA GTATAGACAT TATTGTTATA GTTTGTATGG AACAACATTA  660
GAACAGCAAT ACAACAAACC GTTGTGTGAT TTGTTAATTA GGTGTATTAA CTGTCAAAAG  720
CCACTGTGTC CTGAAGAAAA GCAAAGACAT CTGGACAAAA AGCAAAGATT CCATAATATA  780
AGGGGTCGGT GGACCGGTCG ATGTATGTCT TGTTGCAGAT CATCAAGAAC ACGTAGAGAA  840
ACCCAGCTGA CTAGTGGCCA CCATCACCAT CACCATTAA  879

(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 293 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E6 His HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

Met Lys Gly Gly Ile Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15
Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30
Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45
Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60
Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80
Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95
Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr Tyr
            100                 105                 110
Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Leu Ala Ser Met
        115                 120                 125
Leu Asp Met Ala Met Phe Gln Asp Pro Gln Glu Arg Pro Arg Lys Leu
    130                 135                 140
Pro Gln Leu Cys Thr Glu Leu Gln Thr Thr Ile His Asp Ile Ile Leu
145                 150                 155                 160
Glu Cys Val Tyr Cys Lys Gln Gln Leu Leu Arg Arg Glu Val Tyr Asp
                165                 170                 175
Phe Ala Phe Arg Asp Leu Cys Ile Val Tyr Arg Asp Gly Asn Pro Tyr
            180                 185                 190
Ala Val Cys Asp Lys Cys Leu Lys Phe Tyr Ser Lys Ile Ser Glu Tyr
        195                 200                 205
Arg His Tyr Cys Tyr Ser Leu Tyr Gly Thr Thr Leu Glu Gln Gln Tyr
    210                 215                 220
Asn Lys Pro Leu Cys Asp Leu Leu Ile Arg Cys Ile Asn Cys Gln Lys
225                 230                 235                 240
Pro Leu Cys Pro Glu Glu Lys Gln Arg His Leu Asp Lys Lys Gln Arg
                245                 250                 255
Phe His Asn Ile Arg Gly Arg Trp Thr Gly Arg Cys Met Ser Cys Cys
            260                 265                 270
Arg Ser Ser Arg Thr Arg Arg Glu Thr Gln Leu Thr Ser Gly His His
        275                 280                 285
His His His His
    290
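
Amino acid records in this listing use three-letter residue codes with position numbers on alternate lines. As a hedged illustration only (the helper `to_one_letter` and its skip-the-number-lines rule are assumptions for this sketch, not part of the filed listing), the following collapses such rows into a one-letter string, which makes it easier to count residues or compare two records such as SEQ ID NO:10 and SEQ ID NO:14.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(rows):
    """Join residue rows into a one-letter string, skipping position numbers."""
    residues = []
    for row in rows:
        for token in row.split():
            if token.isdigit():  # position markers such as 5, 10, 15
                continue
            residues.append(THREE_TO_ONE[token])  # KeyError flags a garbled code
    return "".join(residues)


# First residue row and number row of SEQ ID NO:10.
rows = [
    "Met Lys Gly Gly Ile Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys",
    "1               5                   10                  15",
]
print(to_one_letter(rows))
```

Raising `KeyError` on an unknown token is a deliberate choice here: scanned listings often carry misread codes, and failing loudly is safer than guessing a residue.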

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 720 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E7 HIS HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

ATGAAAGGGG GAATTGTACA TTCAGACGGC TCTTATCCAA AAGACAAGTT TGAGAAAATC  60
AATGGCACTT GGTACTACTT TGACAGTTCA GGCTATATGC TTGCAGACCG CTGGAGGAAG  120
CACACAGACG GCAACTGGTA CTGGTTCGAC AACTCAGGCG AAATGGCTAC AGGCTGGAAG  180
AAAATCGCTG ATAAGTGGTA CTATTTCAAC GAAGAAGGTG CCATGAAGAC AGGCTGGGTC  240
AAGTACAAGG ACACTTGGTA CTACTTAGAC GCTAAAGAAG GCGCCATGGT ATCAAATGCC  300
TTTATCCAGT CAGCGGACGG AACAGGCTGG TACTACCTCA AACCAGACGG AACACTGGCA  360
GACAGGCCAG AATTGGCCAG CATGCTGGAC ATGGCCATGC ATGGAGATAC ACCTACATTG  420
CATGAATATA TGTTAGATTT GCAACCAGAG ACAACTGATC TCTACTGTTA TGAGCAATTA  480
AATGACAGCT CAGAGGAGGA GGATGAAATA GATGGTCCAG CTGGACAAGC AGAACCGGAC  540
AGAGCCCATT ACAATATTGT AACCTTTTGT TGCAAGTGTG ACTCTACGCT TCGGTTGTGC  600
GTACAAAGCA CACACGTAGA CATTCGTACT TTGGAAGACC TGTTAATGGG CACACTAGGA  660
ATTGTGTGCC CCATCTGTTC TCAGAAACCA ACTAGTGGCC ACCATCACCA TCACCATTAA  720



WO 99/33868 PCT/EP98/08563

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 240 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E7 HIS HPV 16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

Met Lys Gly Gly Ile Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15
Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30
Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45
Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60
Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80
Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95
Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr Tyr
            100                 105                 110
Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Leu Ala Ser Met
        115                 120                 125
Leu Asp Met Ala Met His Gly Asp Thr Pro Thr Leu His Glu Tyr Met
    130                 135                 140
Leu Asp Leu Gln Pro Glu Thr Thr Asp Leu Tyr Cys Tyr Glu Gln Leu
145                 150                 155                 160
Asn Asp Ser Ser Glu Glu Glu Asp Glu Ile Asp Gly Pro Ala Gly Gln
                165                 170                 175
Ala Glu Pro Asp Arg Ala His Tyr Asn Ile Val Thr Phe Cys Cys Lys
            180                 185                 190
Cys Asp Ser Thr Leu Arg Leu Cys Val Gln Ser Thr His Val Asp Ile
        195                 200                 205
Arg Thr Leu Glu Asp Leu Leu Met Gly Thr Leu Gly Ile Val Cys Pro
    210                 215                 220
Ile Cys Ser Gln Lys Pro Thr Ser Gly His His His His His His
225                 230                 235



(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 1173 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E6E7 His

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

ATGAAAGGGG GAATTGTACA TTCAGACGGC TCTTATCCAA AAGACAAGTT TGAGAAAATC  60
AATGGCACTT GGTACTACTT TGACAGTTCA GGCTATATGC TTGCAGACCG CTGGAGGAAG  120
CACACAGACG GCAACTGGTA CTGGTTCGAC AACTCAGGCG AAATGGCTAC AGGCTGGAAG  180
AAAATCGCTG ATAAGTGGTA CTATTTCAAC GAAGAAGGTG CCATGAAGAC AGGCTGGGTC  240
AAGTACAAGG ACACTTGGTA CTACTTAGAC GCTAAAGAAG GCGCCATGGT ATCAAATGCC  300
TTTATCCAGT CAGCGGACGG AACAGGCTGG TACTACCTCA AACCAGACGG AACACTGGCA  360
GACAGGCCAG AATTGGCCAG CATGCTGGAC ATGGCCATGT TTCAGGACCC ACAGGAGCGA  420
CCCAGAAAGT TACCACAGTT ATGCACAGAG CTGCAAACAA CTATACATGA TATAATATTA  480
GAATGTGTGT ACTGCAAGCA ACAGTTACTG CGACGTGAGG TATATGACTT TGCTTTTCGG  540
GATTTATGCA TAGTATATAG AGATGGGAAT CCATATGCTG TATGTGATAA ATGTTTAAAG  600
TTTTATTCTA AAATTAGTGA GTATAGACAT TATTGTTATA GTTTGTATGG AACAACATTA  660
GAACAGCAAT ACAACAAACC GTTGTGTGAT TTGTTAATTA GGTGTATTAA CTGTCAAAAG  720
CCACTGTGTC CTGAAGAAAA GCAAAGACAT CTGGACAAAA AGCAAAGATT CCATAATATA  780
AGGGGTCGGT GGACCGGTCG ATGTATGTCT TGTTGCAGAT CATCAAGAAC ACGTAGAGAA  840
ACCCAGCTGA TGCATGGAGA TACACCTACA TTGCATGAAT ATATGTTAGA TTTGCAACCA  900
GAGACAACTG ATCTCTACTG TTATGAGCAA TTAAATGACA GCTCAGAGGA GGAGGATGAA  960
ATAGATGGTC CAGCTGGACA AGCAGAACCG GACAGAGCCC ATTACAATAT TGTAACCTTT  1020
TGTTGCAAGT GTGACTCTAC GCTTCGGTTG TGCGTACAAA GCACACACGT AGACATTCGT  1080
ACTTTGGAAG ACCTGTTAAT GGGCACACTA GGAATTGTGT GCCCCATCTG TTCTCAGAAA  1140
CCAACTAGTG GCCACCATCA CCATCACCAT TAA  1173

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 391 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

CLYTA E6E7 His HPV16

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Met Lys Gly Gly Ile Val His Ser Asp Gly Ser Tyr Pro Lys Asp Lys
1               5                   10                  15
Phe Glu Lys Ile Asn Gly Thr Trp Tyr Tyr Phe Asp Ser Ser Gly Tyr
            20                  25                  30
Met Leu Ala Asp Arg Trp Arg Lys His Thr Asp Gly Asn Trp Tyr Trp
        35                  40                  45
Phe Asp Asn Ser Gly Glu Met Ala Thr Gly Trp Lys Lys Ile Ala Asp
    50                  55                  60
Lys Trp Tyr Tyr Phe Asn Glu Glu Gly Ala Met Lys Thr Gly Trp Val
65                  70                  75                  80
Lys Tyr Lys Asp Thr Trp Tyr Tyr Leu Asp Ala Lys Glu Gly Ala Met
                85                  90                  95
Val Ser Asn Ala Phe Ile Gln Ser Ala Asp Gly Thr Gly Trp Tyr Tyr
            100                 105                 110
Leu Lys Pro Asp Gly Thr Leu Ala Asp Arg Pro Glu Leu Ala Ser Met
        115                 120                 125
Leu Asp Met Ala Met Phe Gln Asp Pro Gln Glu Arg Pro Arg Lys Leu
    130                 135                 140
Pro Gln Leu Cys Thr Glu Leu Gln Thr Thr Ile His Asp Ile Ile Leu
145                 150                 155                 160
Glu Cys Val Tyr Cys Lys Gln Gln Leu Leu Arg Arg Glu Val Tyr Asp
                165                 170                 175
Phe Ala Phe Arg Asp Leu Cys Ile Val Tyr Arg Asp Gly Asn Pro Tyr
            180                 185                 190
Ala Val Cys Asp Lys Cys Leu Lys Phe Tyr Ser Lys Ile Ser Glu Tyr
        195                 200                 205
Arg His Tyr Cys Tyr Ser Leu Tyr Gly Thr Thr Leu Glu Gln Gln Tyr
    210                 215                 220
Asn Lys Pro Leu Cys Asp Leu Leu Ile Arg Cys Ile Asn Cys Gln Lys
225                 230                 235                 240
Pro Leu Cys Pro Glu Glu Lys Gln Arg His Leu Asp Lys Lys Gln Arg
                245                 250                 255
Phe His Asn Ile Arg Gly Arg Trp Thr Gly Arg Cys Met Ser Cys Cys
            260                 265                 270
Arg Ser Ser Arg Thr Arg Arg Glu Thr Gln Leu Met His Gly Asp Thr
        275                 280                 285
Pro Thr Leu His Glu Tyr Met Leu Asp Leu Gln Pro Glu Thr Thr Asp
    290                 295                 300
Leu Tyr Cys Tyr Glu Gln Leu Asn Asp Ser Ser Glu Glu Glu Asp Glu
305                 310                 315                 320
Ile Asp Gly Pro Ala Gly Gln Ala Glu Pro Asp Arg Ala His Tyr Asn
                325                 330                 335
Ile Val Thr Phe Cys Cys Lys Cys Asp Ser Thr Leu Arg Leu Cys Val
            340                 345                 350
Gln Ser Thr His Val Asp Ile Arg Thr Leu Glu Asp Leu Leu Met Gly
        355                 360                 365
Thr Leu Gly Ile Val Cys Pro Ile Cys Ser Gln Lys Pro Thr Ser Gly
    370                 375                 380
His His His His His His
385                 390



(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 684 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 his HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGCATGGACC TAAGGCAACA  360
TTGCAAGACA TTGTATTGCA TTTAGAGCCC CAAAATGAAA TTCCGGTTGA CCTTCTATGT  420
CACGAGCAAT TAAGCGACTC AGAGGAAGAA AACGATGAAA TAGATGAAGT TAATCATCAA  480
CATTTACCAG CCCGACGAGC CGAACCACAA CGTCACACAA TGTTGTGTAT GTGTTGTAAG  540
TGTGAAGCCA GAATTGAGCT AGTAGTAGAA AGCTCAGCAG ACGACCTTCG AGCATTCCAG  600
CAGCTGTTTC TGAACACCCT GTCCTTTGTG TGTCCGTGGT GTGCATCCCA GCAGACTAGT  660
GGCCACCATC ACCATCACCA TTAA  684



(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 228 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 his HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met His Gly Pro Lys Ala Thr Leu Gln Asp Ile Val Leu His Leu
        115                 120                 125
Glu Pro Gln Asn Glu Ile Pro Val Asp Leu Leu Cys His Glu Gln Leu
    130                 135                 140
Ser Asp Ser Glu Glu Glu Asn Asp Glu Ile Asp Glu Val Asn His Gln
145                 150                 155                 160
His Leu Pro Ala Arg Arg Ala Glu Pro Gln Arg His Thr Met Leu Cys
                165                 170                 175
Met Cys Cys Lys Cys Glu Ala Arg Ile Glu Leu Val Val Glu Ser Ser
            180                 185                 190
Ala Asp Asp Leu Arg Ala Phe Gln Gln Leu Phe Leu Asn Thr Leu Ser
        195                 200                 205
Phe Val Cys Pro Trp Cys Ala Ser Gln Gln Thr Ser Gly His His His
    210                 215                 220
His His His
225

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 110 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Thioredoxin

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp
1               5                   10                  15
Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp
            20                  25                  30
Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp
        35                  40                  45
Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn
    50                  55                  60
Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu
65                  70                  75                  80
Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser
                85                  90                  95
Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala
            100                 105

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 684 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 mutated HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCCA TGCATGGACC TAAGGCAACA  360
TTGCAAGACA TTGTATTGCA TTTAGAGCCC CAAAATGAAA TTCCGGTTGA CCTTCTAGGT  420
CACCAGCAAT TAAGCGACTC AGAGGAAGAA AACGATGAAA TAGATGGAGT TAATCATCAA  480
CATTTACCAG CCCGACGAGC CGAACCACAA CGTCACACAA TGTTGTGTAT GTGTTGTAAG  540
TGTGAAGCCA GAATTGAGCT AGTAGTAGAA AGCTCAGCAG ACGACCTTCG AGCATTCCAG  600
CAGCTGTTTC TGAACACCCT GTCCTTTGTG TGTCCGTGGT GTGCATCCCA GCAGACTAGT  660
GGCCACCATC ACCATCACCA TTAA  684

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 228 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E7 mutated HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1               5                   10                  15
Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
            20                  25                  30
Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
        35                  40                  45
Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
    50                  55                  60
Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65                  70                  75                  80
Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
                85                  90                  95
Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
            100                 105                 110
Ala Met His Gly Pro Lys Ala Thr Leu Gln Asp Ile Val Leu His Leu
        115                 120                 125
Glu Pro Gln Asn Glu Ile Pro Val Asp Leu Leu Gly His Gln Gln Leu
    130                 135                 140
Ser Asp Ser Glu Glu Glu Asn Asp Glu Ile Asp Gly Val Asn His Gln
145                 150                 155                 160
His Leu Pro Ala Arg Arg Ala Glu Pro Gln Arg His Thr Met Leu Cys
                165                 170                 175
Met Cys Cys Lys Cys Glu Ala Arg Ile Glu Leu Val Val Glu Ser Ser
            180                 185                 190
Ala Asp Asp Leu Arg Ala Phe Gln Gln Leu Phe Leu Asn Thr Leu Ser
        195                 200                 205
Phe Val Cys Pro Trp Cys Ala Ser Gln Gln Thr Ser Gly His His His
    210                 215                 220
His His His
225

(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 837 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

Protein D 1/3 E6 - His HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC  60
ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA  120
CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT  180
CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC  240
CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT  300
CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCGC GCTTTGAGGA TCCAACACGG  360
CGACCCTACA AGCTACCTGA TCTGTGCACG GAACTGAACA CTTCACTGCA AGACATAGAA  420
ATAACCTGTG TATATTGCAA GACAGTATTG GAACTTACAG AGGTATTTGA ATTTGCATTT  480
AAAGATTTAT TTGTGGTGTA TAGAGACAGT ATACCGCATG CTGCATGCCA TAAATGTATA  540
GATTTTTATT CTAGAATTAG AGAATTAAGA CATTATTCAG ACTCTGTGTA TGGAGACACA  600
TTGGAAAAAC TAACTAACAC TGGGTTATAC AATTTATTAA TAAGGTGCCT GCGGTGCCAG  660
AAACCGTTGA ATCCAGCAGA AAAACTTAGA CACCTTAATG AAAAACGACG ATTTCACAAC  720
ATAGCTGGGC ACTATAGAGG CCAGTGCCAT TCGTGCTGCA ACCGAGCACG ACAGGAACGA  780
CTCCAACGAC GCAGAGAAAC ACAAGTAACT AGTGGCCACC ATCACCATCA CCATTAA  837

(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 279 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

Protein D 1/3 E6 - His HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1 5 10 15

Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
20 25 30

Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
35 40 45

Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
50 55 60

Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65 70 75 80

Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
85 90 95

Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
100 105 110

Ala Arg Phe Glu Asp Pro Thr Arg Arg Pro Tyr Lys Leu Pro Asp Leu
115 120 125

Cys Thr Glu Leu Asn Thr Ser Leu Gln Asp Ile Glu Ile Thr Cys Val
130 135 140

Tyr Cys Lys Thr Val Leu Glu Leu Thr Glu Val Phe Glu Phe Ala Phe
145 150 155 160

Lys Asp Leu Phe Val Val Tyr Arg Asp Ser Ile Pro His Ala Ala Cys
165 170 175

His Lys Cys Ile Asp Phe Tyr Ser Arg Ile Arg Glu Leu Arg His Tyr
180 185 190

Ser Asp Ser Val Tyr Gly Asp Thr Leu Glu Lys Leu Thr Asn Thr Gly
195 200 205

Leu Tyr Asn Leu Leu Ile Arg Cys Leu Arg Cys Gln Lys Pro Leu Asn
210 215 220

Pro Ala Glu Lys Leu Arg His Leu Asn Glu Lys Arg Arg Phe His Asn
225 230 235 240

Ile Ala Gly His Tyr Arg Gly Gln Cys His Ser Cys Cys Asn Arg Ala
245 250 255

Arg Gln Glu Arg Leu Gln Arg Arg Arg Glu Thr Gln Val Thr Ser Gly
260 265 270

His His His His His His
275

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1152 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

Protein D1/3 E6 E7 His / HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:

ATGGATCCAA GCAGCCATTC ATCAAATATG GCGAATACCC AAATGAAATC AGACAAAATC
60

ATTATTGCTC ACCGTGGTGC TAGCGGTTAT TTACCAGAGC ATACGTTAGA ATCTAAAGCA
120

CTTGCGTTTG CACAACAGGC TGATTATTTA GAGCAAGATT TAGCAATGAC TAAGGATGGT
180

CGTTTAGTGG TTATTCACGA TCACTTTTTA GATGGCTTGA CTGATGTTGC GAAAAAATTC
240

CCACATCGTC ATCGTAAAGA TGGCCGTTAC TATGTCATCG ACTTTACCTT AAAAGAAATT
300

CAAAGTTTAG AAATGACAGA AAACTTTGAA ACCATGGCGC GCTTTGAGGA TCCAACACGG
360

CGACCCTACA AGCTACCTGA TCTGTGCACG GAACTGAACA CTTCACTGCA AGACATAGAA
420

ATAACCTGTG TATATTGCAA GACAGTATTG GAACTTACAG AGGTATTTGA ATTTGCATTT
480

AAAGATTTAT TTGTGGTGTA TAGAGACAGT ATACCGCATG CTGCATGCCA TAAATGTATA
540

GATTTTTATT CTAGAATTAG AGAATTAAGA CATTATTCAG ACTCTGTGTA TGGAGACACA
600

TTGGAAAAAC TAACTAACAC TGGGTTATAC AATTTATTAA TAAGGTGCCT GCGGTGCCAG
660

AAACCGTTGA ATCCAGCAGA AAAACTTAGA CACCTTAATG AAAAACGACG ATTTCACAAC
720

ATAGCTGGGC ACTATAGAGG CCAGTGCCAT TCGTGCTGCA ACCGAGCACG ACAGGAACGA
780
CTCCAACGAC GCAGAGAAAC ACAAGTAATG CATGGACCTA AGGCAACATT GCAAGACATT

840

GTATTGCATT TAGAGCCCCA AAATGAAATT CCGGTTGACC TTCTATGTCA CGAGCAATTA
900

AGCGACTCAG AGGAAGAAAA CGATGAAATA GATGGAGTTA ATCATCAACA TTTACCAGCC
960

CGACGAGCCG AACCACAACG TCACACAATG TTGTGTATGT GTTGTAAGTG TGAAGCCAGA
1020

ATTGAGCTAG TAGTAGAAAG CTCAGCAGAC GACCTTCGAG CATTCCAGCA GCTGTTTCTG
1080

AACACCCTGT CCTTTGTGTG TCCGTGGTGT GCATCCCAGC AGACTAGTGG CCACCATCAC
1140

CATCACCATT AA
1152

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 384 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

Protein D1/3 E6 E7 His / HPV 18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

Met Asp Pro Ser Ser His Ser Ser Asn Met Ala Asn Thr Gln Met Lys
1 5 10 15

Ser Asp Lys Ile Ile Ile Ala His Arg Gly Ala Ser Gly Tyr Leu Pro
20 25 30

Glu His Thr Leu Glu Ser Lys Ala Leu Ala Phe Ala Gln Gln Ala Asp
35 40 45

Tyr Leu Glu Gln Asp Leu Ala Met Thr Lys Asp Gly Arg Leu Val Val
50 55 60

Ile His Asp His Phe Leu Asp Gly Leu Thr Asp Val Ala Lys Lys Phe
65 70 75 80

Pro His Arg His Arg Lys Asp Gly Arg Tyr Tyr Val Ile Asp Phe Thr
85 90 95

Leu Lys Glu Ile Gln Ser Leu Glu Met Thr Glu Asn Phe Glu Thr Met
100 105 110

Ala Arg Phe Glu Asp Pro Thr Arg Arg Pro Tyr Lys Leu Pro Asp Leu
115 120 125

Cys Thr Glu Leu Asn Thr Ser Leu Gln Asp Ile Glu Ile Thr Cys Val
130 135 140

Tyr Cys Lys Thr Val Leu Glu Leu Thr Glu Val Phe Glu Phe Ala Phe
145 150 155 160

Lys Asp Leu Phe Val Val Tyr Arg Asp Ser Ile Pro His Ala Ala Cys
165 170 175

His Lys Cys Ile Asp Phe Tyr Ser Arg Ile Arg Glu Leu Arg His Tyr
180 185 190

Ser Asp Ser Val Tyr Gly Asp Thr Leu Glu Lys Leu Thr Asn Thr Gly
195 200 205

Leu Tyr Asn Leu Leu Ile Arg Cys Leu Arg Cys Gln Lys Pro Leu Asn
210 215 220

Pro Ala Glu Lys Leu Arg His Leu Asn Glu Lys Arg Arg Phe His Asn
225 230 235 240

Ile Ala Gly His Tyr Arg Gly Gln Cys His Ser Cys Cys Asn Arg Ala
245 250 255

Arg Gln Glu Arg Leu Gln Arg Arg Arg Glu Thr Gln Val Met His Gly
260 265 270

Pro Lys Ala Thr Leu Gln Asp Ile Val Leu His Leu Glu Pro Gln Asn
275 280 285

Glu Ile Pro Val Asp Leu Leu Cys His Glu Gln Leu Ser Asp Ser Glu
290 295 300






Glu Glu Asn Asp Glu Ile Asp Gly Val Asn His Gln His Leu Pro Ala
305 310 315 320

Arg Arg Ala Glu Pro Gln Arg His Thr Met Leu Cys Met Cys Cys Lys
325 330 335

Cys Glu Ala Arg Ile Glu Leu Val Val Glu Ser Ser Ala Asp Asp Leu
340 345 350

Arg Ala Phe Gln Gln Leu Phe Leu Asn Thr Leu Ser Phe Val Cys Pro
355 360 365

Trp Cys Ala Ser Gln Gln Thr Ser Gly His His His His His His
370 375 380



